C-LSTM: Enabling Efficient LSTM using Structured Compression Techniques on FPGAs

Authors

Shuo Wang

Zhe Li

Caiwen Ding

Bo Yuan

Qinru Qiu

Yanzhi Wang

Yun Liang

Abstract

Recently, significant accuracy improvements have been achieved for acoustic recognition systems by increasing the model size of Long Short-Term Memory (LSTM) networks. Unfortunately, the ever-increasing size of LSTM models leads to inefficient designs on FPGAs due to the limited on-chip resources. Previous work proposes a pruning-based compression technique to reduce the model size and thus speed up inference on FPGAs. However, the random nature of pruning transforms the dense matrices of the model into highly unstructured sparse ones, which leads to unbalanced computation and irregular memory accesses and thus hurts overall performance and energy efficiency. In contrast, we propose a structured compression technique that not only reduces the LSTM model size but also eliminates the irregularities of computation and memory accesses. This approach employs block-circulant instead of sparse matrices to compress weight matrices and reduces the storage requirement from O(k^2) to O(k). The Fast Fourier Transform algorithm is utilized to further accelerate inference by reducing the computational complexity from O(k^2) to O(k log k). The datapath and activation functions are quantized to 16 bits to improve resource utilization. More importantly, we propose a comprehensive framework called C-LSTM to automatically optimize and implement a wide range of LSTM variants on FPGAs. According to the experimental results, C-LSTM achieves up to 18.8X and 33.5X gains in performance and energy efficiency, respectively, compared with the state-of-the-art LSTM implementation under the same experimental setup, and the accuracy degradation is very small.
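The storage and complexity figures in the abstract can be illustrated with a small sketch. The code below is not the paper's FPGA implementation; it is a plain-Python illustration under stated assumptions: each k-by-k circulant block is defined by its first column (the paper may use a different convention), k is a power of two, and all function names (`fft`, `circulant_matvec`, `block_circulant_matvec`, `dense_circulant`) are hypothetical. It shows that a circulant block needs only k stored values and that its matrix-vector product equals a circular convolution, which the FFT computes in O(k log k).

```python
import cmath


def fft(x, inverse=False):
    """Recursive radix-2 FFT. Assumes len(x) is a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2], inverse)
    odd = fft(x[1::2], inverse)
    sign = 1 if inverse else -1
    out = [0j] * n
    for i in range(n // 2):
        t = cmath.exp(sign * 2j * cmath.pi * i / n) * odd[i]
        out[i] = even[i] + t
        out[i + n // 2] = even[i] - t
    return out


def ifft(x):
    """Inverse FFT with the 1/n normalization applied once at the end."""
    n = len(x)
    return [v / n for v in fft(x, inverse=True)]


def dense_circulant(c):
    """Expand defining vector c (assumed: first column) to a full k-by-k matrix."""
    k = len(c)
    return [[c[(i - j) % k] for j in range(k)] for i in range(k)]


def circulant_matvec(c, x):
    """Compute C @ x from the k stored values in c, via FFT (O(k log k))."""
    fc, fx = fft(c), fft(x)
    return [v.real for v in ifft([a * b for a, b in zip(fc, fx)])]


def block_circulant_matvec(blocks, x, k):
    """blocks[p][q] is the defining vector of circulant block (p, q)."""
    y = []
    for row in blocks:
        acc = [0.0] * k
        for q, c in enumerate(row):
            part = circulant_matvec(c, x[q * k:(q + 1) * k])
            acc = [a + b for a, b in zip(acc, part)]
        y.extend(acc)
    return y


if __name__ == "__main__":
    c = [1.0, 2.0, 3.0, 4.0]
    x = [0.5, -1.0, 2.0, 3.0]
    dense = dense_circulant(c)
    ref = [sum(dense[i][j] * x[j] for j in range(4)) for i in range(4)]
    fast = circulant_matvec(c, x)
    # Largest absolute difference between the dense and FFT-based results.
    print(max(abs(a - b) for a, b in zip(ref, fast)))
```

In this sketch a weight matrix made of p-by-q circulant blocks stores p*q*k values instead of p*q*k^2. A hardware design would typically precompute the FFT of each defining vector offline; that optimization is omitted here for brevity.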


Similar resources

Simulate Congestion Prediction in a Wireless Network Using the LSTM Deep Learning Model

Wireless networks have spread widely since their inception, driven by the growing number of wireless devices such as smartphones and laptops. This proliferation coincides with a fast, easy-to-use Internet that delivers varied data such as video clips and games. This is where the congestion problem arises, and the aim of the research is t...


Extending Long Short-Term Memory for Multi-View Structured Learning

Long Short-Term Memory (LSTM) networks have been successfully applied to a number of sequence learning problems but they lack the design flexibility to model multiple view interactions, limiting their ability to exploit multi-view relationships. In this paper, we propose a Multi-View LSTM (MV-LSTM), which explicitly models the view-specific and cross-view interactions over time or structured ou...


Effective Quantization Approaches for Recurrent Neural Networks

Deep learning, and Recurrent Neural Networks (RNNs) in particular, has shown superior accuracy in a large variety of tasks including machine translation, language understanding, and movie frame generation. However, these deep learning approaches are very expensive in terms of computation. In most cases, Graphics Processing Units (GPUs) are used for large-scale implementations. Meanwhile, energy e...


Long Short-term Memory Network over Rhetorical Structure Theory for Sentence-level Sentiment Analysis

Using deep learning models for sentence-level sentiment analysis is still a challenging task. The long short-term memory (LSTM) network solves the vanishing gradient problem of recurrent neural networks (RNNs), but the LSTM has a linear chain structure that cannot capture text structure information. Subsequently, Tree-LSTM was proposed, which uses the LSTM forget gate to skip sub-trees that h...


Residual LSTM: Design of a Deep Recurrent Architecture for Distant Speech Recognition

In this paper, a novel architecture for a deep recurrent neural network, residual LSTM, is introduced. A plain LSTM has an internal memory cell that can learn long-term dependencies in sequential data. It also provides a temporal shortcut path to avoid vanishing or exploding gradients in the temporal domain. The residual LSTM provides an additional spatial shortcut path from lower layers for eff...




Publication date: 2018
